Automated Arabic Antonym Extraction Using a Corpus Analysis Tool
Abstract
Automatically extracting semantic relations between words from textual corpora is an extremely challenging task. The growing need for language resources that support natural language processing (NLP) applications has encouraged the development of automated methods for extracting such relations. Corpus statistics and similarity distribution methods can help with extracting semantic relations between pairs of words. In this paper, we present a pattern-based bootstrapping approach that uses Arabic language corpora and a corpus analysis tool (Sketch Engine) to extract antonym relations between word pairs. The algorithm uses the logDice score and pattern co-occurrence to classify extracted pairs as antonyms. Evaluation results show that our approach extracts antonym relations with a precision of 76%.
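As a rough illustration of the classification step mentioned in the abstract, the sketch below computes the standard logDice association score from raw frequencies and combines it with a count of pattern co-occurrences. The abstract does not give the decision rule, so the function `is_antonym_candidate`, its parameters, and both threshold values are hypothetical assumptions, not the authors' settings.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Standard logDice score: 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy is the co-occurrence frequency of the two words;
    f_x and f_y are their individual corpus frequencies.
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def is_antonym_candidate(
    f_xy: int,
    f_x: int,
    f_y: int,
    pattern_hits: int,
    min_log_dice: float = 7.0,   # hypothetical threshold, not from the paper
    min_pattern_hits: int = 2,   # hypothetical threshold, not from the paper
) -> bool:
    """Assumed decision rule: a pair qualifies only if it is strongly
    associated AND was seen in enough antonym-indicating patterns."""
    score = log_dice(f_xy, f_x, f_y)
    return score >= min_log_dice and pattern_hits >= min_pattern_hits


if __name__ == "__main__":
    # Toy frequencies, invented purely for illustration.
    print(log_dice(50, 100, 100))
    print(is_antonym_candidate(50, 100, 100, pattern_hits=3))
```

In practice the frequencies and pattern counts would come from corpus queries, and the thresholds would need tuning against a labelled set of word pairs.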
Similar resources
Positive Negative Neutral Sentiment Analysis Using Dual Sentiment Analysis
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). Then begin by ident...
Arabic Entity Graph Extraction Using Morphology, Finite State Machines, and Graph Transformations
Research on automatic recognition of named entities from Arabic text uses techniques that work well for Latin-based languages, such as local grammars, statistical learning models, pattern matching, and rule-based techniques. These techniques boost their results by using application-specific corpora, parallel language corpora, and morphological stemming analysis. We propose a method for extra...
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a prerequisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Identifying Metaphoric Antonyms in a Corpus Analysis of Finance Articles
Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP and DOWN-verbs used to describe movements of indices, stocks and shares. In Study 1 people identified antonyms of these verb sets in a free-generation task and a match-the-opposite task and the most commonly identified antonyms were compiled. In Study 2, we ...
Multi-Level Analysis and Annotation of Arabic Corpora for Text-to-Sign Language MT
The Arabic language is morphologically rich and syntactically complex with many differences from European languages, and this creates a challenge when porting existing annotation tools to Arabic. In this paper, we present an ongoing effort in lexical semantic analysis and annotation of Modern Standard Arabic (MSA) text, a semi-automatic annotation tool concerned with the morphologic, syntactic,...
Publication date: 2014